489 research outputs found
The Functional Consequences of Variation in Transcription Factor Binding
One goal of human genetics is to understand how the information for precise
and dynamic gene expression programs is encoded in the genome. The interactions
of transcription factors (TFs) with DNA regulatory elements clearly play an
important role in determining gene expression outputs, yet the regulatory logic
underlying functional transcription factor binding is poorly understood. Many
studies have focused on characterizing the genomic locations of TF binding, yet
it is unclear to what extent TF binding at any specific locus has functional
consequences with respect to gene expression output. To evaluate the context of
functional TF binding we knocked down 59 TFs and chromatin modifiers in one
HapMap lymphoblastoid cell line. We then identified genes whose expression was
affected by the knockdowns. We intersected the gene expression data with
transcription factor binding data (based on ChIP-seq and DNase-seq) within 10
kb of the transcription start sites of expressed genes. This combination of
data allowed us to infer functional TF binding. On average, 14.7% of genes
bound by a factor were differentially expressed following the knockdown of that
factor, suggesting that most interactions between TF and chromatin do not
result in measurable changes in gene expression levels of putative target
genes. We found that functional TF binding is enriched in regulatory elements
that harbor a large number of TF binding sites, at sites with predicted higher
binding affinity, and at sites that are enriched in genomic regions annotated
as active enhancers.
Comment: 30 pages, 6 figures (7 supplemental figures and 6 supplemental tables
available upon request to [email protected]). Submitted to PLoS
Genetic
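The intersection step described in the abstract above can be sketched as a set overlap: for one factor, take the genes it binds and ask what fraction are differentially expressed after its knockdown. This is only an illustrative sketch. The function name and the gene identifiers are hypothetical, and the abstract does not specify how the authors computed the overlap.

```python
def functional_binding_fraction(bound_genes, de_genes):
    """Fraction of genes bound by a factor that are also differentially
    expressed after that factor is knocked down."""
    bound = set(bound_genes)
    if not bound:
        return 0.0
    return len(bound & set(de_genes)) / len(bound)


# Hypothetical toy data: identifiers are made up for illustration only.
bound = {"G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8"}  # genes bound near the TSS
de = {"G2", "G9"}                                         # genes changed by the knockdown

frac = functional_binding_fraction(bound, de)
print(f"{frac:.1%} of bound genes respond to the knockdown")
```

Averaging this fraction over all knocked-down factors would give a summary of the same kind as the 14.7% figure reported in the abstract.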
All-sky signals from recombination to reionization with the SKA
Cosmic evolution in the hydrogen content of the Universe through
recombination and up to the end of reionization is expected to be revealed as
subtle spectral features in the uniform extragalactic cosmic radio background.
The redshift evolution in the excitation temperature of the 21-cm spin flip
transition of neutral hydrogen appears as redshifted emission and absorption
against the cosmic microwave background. The precise signature of the spectral
trace from cosmic dawn and the epoch of reionization depends on the
spectral radiance, abundance and distribution of the first bound systems of
stars and early galaxies, which govern the evolution in the spin-flip level
populations. The redshifted 21-cm signal from these epochs, when the spin temperature
deviates from the temperature of the ambient relic cosmic microwave background,
results in an all-sky spectral structure in the 40-200 MHz range, almost wholly
within the band of SKA-Low. Another spectral structure from gas evolution is
redshifted recombination lines from the epoch of recombination of hydrogen and
helium; the weak all-sky spectral structure arising from this event is best
detected at the upper end of the 350-3050 MHz band of SKA-mid. Total power
spectra of SKA interferometer elements form the measurement set for these faint
signals from recombination and reionization; the inter-element interferometer
visibilities form a calibration set. The challenge is in precision polarimetric
calibration of the element spectral response and solving for additives and
unwanted confusing leakages of sky angular structure modes into spectral modes.
Herein we discuss observing methods and design requirements that make possible
these all-sky SKA measurements of the cosmic evolution of hydrogen.
Comment: Accepted for publication in the SKA Science Book 'Advancing
Astrophysics with the Square Kilometre Array', to appear in 201
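As a worked check on the 40-200 MHz band quoted in the abstract above, the observed frequency of the 21-cm line maps to redshift through z = ν_rest / ν_obs − 1, with ν_rest ≈ 1420.406 MHz. The short sketch below applies that relation. The function name is hypothetical and nothing here is taken from the paper beyond the band edges.

```python
HI_REST_MHZ = 1420.405751768  # rest frequency of the 21-cm hydrogen line, in MHz


def redshift_of_21cm(observed_mhz):
    """Redshift at which the 21-cm line is observed at the given frequency."""
    return HI_REST_MHZ / observed_mhz - 1.0


# Band edges quoted in the abstract for the all-sky spectral structure.
for nu in (40.0, 200.0):
    print(f"{nu:6.1f} MHz -> z = {redshift_of_21cm(nu):.1f}")
```

The band therefore spans roughly z ≈ 6 at 200 MHz to z ≈ 34.5 at 40 MHz, which covers cosmic dawn and the epoch of reionization.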
Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data
Many aspects of the historical relationships between populations in a species are reflected in genetic data. Inferring these relationships from genetic data, however, remains a challenging task. In this paper, we present a statistical model for inferring the patterns of population splits and mixtures in multiple populations. In our model, the sampled populations in a species are related to their common ancestor through a graph of ancestral populations. Using genome-wide allele frequency data and a Gaussian approximation to genetic drift, we infer the structure of this graph. We applied this method to a set of 55 human populations and a set of 82 dog breeds and wild canids. In both species, we show that a simple bifurcating tree does not fully describe the data; in contrast, we infer many migration events. While some of the migration events that we find have been detected previously, many have not. For example, in the human data, we infer that Cambodians trace approximately 16% of their ancestry to a population ancestral to other extant East Asian populations. In the dog data, we infer that both the boxer and basenji trace a considerable fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to domestication and that East Asian toy breeds (the Shih Tzu and the Pekingese) result from admixture between modern toy breeds and “ancient” Asian breeds. Software implementing the model described here, called TreeMix, is available at http://treemix.googlecode.com.
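The abstract above mentions a Gaussian approximation to genetic drift without giving its form. A common textbook version, assumed here and not confirmed by the abstract, treats the allele frequency after drift as normally distributed around its starting value p0 with variance c · p0(1 − p0), where c ≈ t / (2N) for t generations in a population of N diploids. The sketch below compares that variance with the exact neutral Wright-Fisher variance. Function and parameter names are hypothetical, and this is not TreeMix code.

```python
def exact_drift_variance(p0, n_diploid, generations):
    """Variance of the allele frequency after neutral Wright-Fisher drift."""
    retained = (1 - 1 / (2 * n_diploid)) ** generations
    return p0 * (1 - p0) * (1 - retained)


def gaussian_drift_variance(p0, n_diploid, generations):
    """Variance under the assumed Gaussian approximation, with c = t / (2N)."""
    c = generations / (2 * n_diploid)
    return c * p0 * (1 - p0)


# Hypothetical parameters: the approximation is meant for small drift (t << 2N).
p0, n, t = 0.3, 10_000, 100
exact = exact_drift_variance(p0, n, t)
approx = gaussian_drift_variance(p0, n, t)
print(f"exact = {exact:.6f}, approx = {approx:.6f}")
```

The two values agree closely when t is small relative to 2N and diverge as drift accumulates, which is why the linear form is usually restricted to short branches.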
Assessing the Performance of the Haplotype Block Model of Linkage Disequilibrium
Several recent studies have suggested that linkage disequilibrium (LD) in the human genome has a fundamentally “blocklike” structure. However, thus far there has been little formal assessment of how well the haplotype block model captures the underlying structure of LD. Here we propose quantitative criteria for assessing how blocklike LD is and apply these criteria to both real and simulated data. Analyses of several large data sets indicate that real data show a partial fit to the haplotype block model; some regions conform quite well, whereas others do not. Some improvement could be obtained by genotyping higher marker densities but not by increasing the number of samples. Nonetheless, although the real data are only moderately blocklike, our simulations indicate that, under a model of uniform recombination, the structure of LD would actually fit the block model much less well. Simulations of a model in which much of the recombination occurs in narrow hotspots provide a much better fit to the observed patterns of LD, suggesting that there is extensive fine-scale variation in recombination rates across the human genome.
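The abstract above does not state its quantitative criteria for "blocklike" LD, so the sketch below shows only the standard pairwise LD measures (r² and |D′|) on which block definitions are typically built. It is an assumption-labeled illustration, not the authors' method. The function name and the haplotype counts are hypothetical.

```python
def pairwise_ld(n_AB, n_Ab, n_aB, n_ab):
    """Return (r^2, |D'|) for two biallelic loci from the four haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    d = p_AB - p_A * p_B  # coefficient of linkage disequilibrium, D

    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0

    # D' rescales D by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return r2, d_prime


# Hypothetical haplotype counts (AB, Ab, aB, ab) for one pair of markers.
r2, d_prime = pairwise_ld(40, 10, 0, 50)
print(f"r^2 = {r2:.3f}, |D'| = {d_prime:.3f}")
```

In this toy example one haplotype is absent, so |D′| reaches its maximum while r² stays below 1. That difference is one reason the two measures can disagree about where block boundaries fall.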
Adaptive Evolution of Conserved Noncoding Elements in Mammals
Conserved noncoding elements (CNCs) are an abundant feature of vertebrate genomes. Some CNCs have been shown to act as cis-regulatory modules, but the function of most CNCs remains unclear. To study the evolution of CNCs, we have developed a statistical method called the “shared rates test” to identify CNCs that show significant variation in substitution rates across branches of a phylogenetic tree. We report an application of this method to alignments of 98,910 CNCs from the human, chimpanzee, dog, mouse, and rat genomes. We find that ∼68% of CNCs evolve according to a null model where, for each CNC, a single parameter models the level of constraint acting throughout the phylogeny linking these five species. The remaining ∼32% of CNCs show departures from the basic model including speed-ups and slow-downs on particular branches and occasionally multiple rate changes on different branches. We find that a subset of the significant CNCs have evolved significantly faster than the local neutral rate on a particular branch, providing strong evidence for adaptive evolution in these CNCs. The distribution of these signals on the phylogeny suggests that adaptive evolution of CNCs occurs in occasional short bursts of evolution. Our analyses suggest a large set of promising targets for future functional studies of adaptation.
The Genetic and Mechanistic Basis for Variation in Gene Regulation
It is now well established that noncoding regulatory variants play a central role in the genetics of common diseases and in evolution. However, until recently, we have known little about the mechanisms by which most regulatory variants act. For instance, what types of functional elements in DNA, RNA, or proteins are most often affected by regulatory variants? Which stages of gene regulation are typically altered? How can we predict which variants are most likely to impact regulation in a given cell type? Recent studies, in many cases using quantitative trait loci (QTL)-mapping approaches in cell lines or tissue samples, have provided us with considerable insight into the properties of genetic loci that have regulatory roles. Such studies have uncovered novel biochemical regulatory interactions and led to the identification of previously unrecognized regulatory mechanisms. We have learned that genetic variation is often directly associated with variation in regulatory activities (namely, we can map regulatory QTLs, not just expression QTLs [eQTLs]), and we have taken the first steps towards understanding the causal order of regulatory events (for example, the role of pioneer transcription factors). Yet, in most cases, we still do not know how to interpret overlapping combinations of regulatory interactions, and we are still far from being able to predict how variation in regulatory mechanisms is propagated through a chain of interactions to eventually result in changes in gene expression profiles.
National Institutes of Health (U.S.) (grant NIH HG006123); National Institutes of Health (U.S.) (NIH GM007197); National Institutes of Health (U.S.) (grant NIH MH084703); Howard Hughes Medical Institute; Jane Coffin Childs Memorial Fund for Medical Research (postdoctoral fellowship
Confounding from Cryptic Relatedness in Case-Control Association Studies
Case-control association studies are widely used in the search for genetic variants that contribute to human diseases. It has long been known that such studies may suffer from high rates of false positives if there is unrecognized population structure. It is perhaps less widely appreciated that so-called “cryptic relatedness” (i.e., kinship among the cases or controls that is not known to the investigator) might also inflate the false positive rate. Until now there has been little work to assess how serious this problem is likely to be in practice. In this paper, we develop a formal model of cryptic relatedness, and study its impact on association studies. We provide simple expressions that predict the extent of confounding due to cryptic relatedness. Surprisingly, these expressions are functions of directly observable parameters. Our analytical results show that, for well-designed studies in outbred populations, the degree of confounding due to cryptic relatedness will usually be negligible. In contrast, studies where there is a sampling bias toward collecting relatives may indeed suffer from excessive rates of false positives. Furthermore, cryptic relatedness may be a serious concern in founder populations that have grown rapidly and recently from a small size. As an example, we analyze the impact of excess relatedness among cases for six phenotypes measured in the Hutterite population.
The Effect of Freeze-Thaw Cycles on Gene Expression Levels in Lymphoblastoid Cell Lines
Epstein-Barr virus (EBV) transformed lymphoblastoid cell lines (LCLs) are a widely used renewable resource for functional genomic studies in humans. The ability to accumulate multidimensional data pertaining to the same individual cell lines, from complete genomic sequences to detailed gene regulatory profiles, further enhances the utility of LCLs as a model system. However, the extent to which LCLs are a faithful model system is relatively unknown. We have previously shown that gene expression profiles of newly established LCLs maintain a strong individual component. Here, we extend our study to investigate the effect of freeze-thaw cycles on gene expression patterns in mature LCLs, especially in the context of inter-individual variation in gene expression. We report a profound difference in the gene expression profiles of newly established and mature LCLs. Once newly established LCLs undergo a freeze-thaw cycle, the individual-specific gene expression signatures become much less pronounced as the gene expression levels in LCLs from different individuals converge to a more uniform profile, which reflects a mature transformed B cell phenotype. We found that previously identified eQTLs are enriched among the relatively few genes whose regulation in mature LCLs maintains marked individual signatures. We thus conclude that while insight drawn from gene regulatory studies in mature LCLs may generally not be affected by the artificial nature of the LCL model system, many aspects of primary B cell biology cannot be observed and studied in mature LCL cultures.